
FIELD OF THE INVENTION 
The subject invention is related to inhomogeneous time series analysis, and more 
particularly to the analysis of high-frequency financial data, such as foreign exchange data. 

BACKGROUND

Time series are the common mathematical framework used to represent the world of 
economics and finance. Among time series, the first important classification can be done 
according to the spacing of data points in time. Regularly spaced time series are called 
homogeneous, irregularly spaced series are inhomogeneous. An example of a homogeneous 
time series is a series of daily data, where the data points are separated by one day (on a
business time scale, which omits the weekends and holidays).

In most references on time series analysis, the time series to be treated are restricted to
the field of homogeneous time series (see, e.g., Granger C.W.J. and Newbold P., 1977,
Forecasting economic time series, Academic Press, London; Priestley M.B., 1989,
Non-linear and non-stationary time series analysis, Academic Press, London; Hamilton J.D.,
1994, Time Series Analysis, Princeton University Press, Princeton, New Jersey) (hereinafter,
respectively, Granger and Newbold, 1977; Priestley, 1989; Hamilton, 1994). This restriction
induces numerous simplifications, both conceptually and computationally, and was justified
before fast, inexpensive computers and high-frequency time series were available.

Current empirical research in finance is confronted with an ever-increasing amount of
data, caused in part by increased computer power and communication speed. Many time
series can be obtained at high frequency, often at market tick-by-tick frequency. These time
series are inhomogeneous, since market ticks arrive at random times. Inhomogeneous time 
series by themselves are conceptually simple; the difficulty lies in efficiently extracting and 
computing information from them. 


SUMMARY 

There is thus a need for methods of analyzing inhomogeneous time series. Time 
series based on foreign exchange rates represent a standard example of the practical 
application of such methods. In practice, the methods described herein are also suitable for
applications to homogeneous time series. Given a time series z, such as an asset price, the
general point of view is to compute another time series, such as the volatility of the asset, by
the application of an operator Ω[z]. There is a need for a method of applying a set of basic
operators that can be combined to compute more sophisticated quantities (for example,
different kinds of volatility or correlation). In such a method, a few important considerations
must be kept in mind. First, the computations must be efficient. Even if powerful computers
are becoming cheaper, typical tick-by-tick data in finance is 100 or even 10,000 times denser
than daily data. Clearly, one cannot afford to compute a full convolution for every tick. The
basic workhorse is the exponential moving average (EMA) operator (described below), which
can be computed very efficiently through an iteration formula. A wealth of complex but still
efficient operators can be constructed by combining and iterating the basic operators
described.

Second, stochastic behavior is the dominant characteristic of financial processes. For
tick-by-tick data, not only the values but also the time points of the series are stochastic. In
this random world, point-wise values are of little significance and we are more interested in
average values inside intervals. Thus the usual notion of return also has to be changed. With
daily data, a daily return is computed as $r_i = p_i - p_{i-1}$, i.e., as a point-wise difference between
the price today and the price yesterday. With high-frequency data, a better definition of the
daily return is the difference between the average price of the last few hours and an average
price from one day ago. In this way, it is possible to build smooth variables well-suited to
random processes. The calculus has to be revisited in order to replace point-wise values by
averages over some time intervals.

Third, analyzing data typically involves a characteristic time range; a return r[τ], for
example, is computed on a given time interval τ. With high-frequency data, this characteristic
time interval can vary from a few minutes to several weeks. We have been careful to make
explicit all these time range dependencies in the formulation of operators used in the
described methods.

Finally, we often want smooth operators. Of course, there is a singularity at t = now,
corresponding to the arrival of new information. This new information must be incorporated
immediately, and therefore, the operators may have a jump behavior at t = now. Yet, aside
from this fundamental jump created by the advance of events, it is better to have continuous
and smooth operators. A simple example of a discontinuous operator is an average with a
rectangular weighting function, say of range τ. The second discontinuity at now − τ,
corresponding to forgetting events, is unnecessary and creates spurious noise. Instead, a
preferred embodiment uses moving average weighting functions (kernels) with a smooth
decay to zero.

The above-listed goals are satisfied by the subject invention. A preferred embodiment
comprises a method to obtain predictive information (e.g., volatility) for inhomogeneous
financial time series. Major steps of the method comprise the following: (1) financial market
transaction data is electronically received by a computer over an electronic network; (2) the
received financial market transaction data is electronically stored in a computer-readable
medium accessible to the computer; (3) a time series z is constructed that models the received
financial market transaction data; (4) an exponential moving average operator is constructed;
(5) an iterated exponential moving average operator is constructed that is based on the
exponential moving average operator; (6) a time-translation-invariant, causal operator Ω[z] is
constructed that is based on the iterated exponential moving average operator; (7) values of
one or more predictive factors relating to the time series z and defined in terms of the operator
Ω[z] are calculated by the computer; and (8) the values calculated by the computer are stored
in a computer-readable medium.

Various predictive factors are described below, and specifically comprise return,
momentum, and volatility. Other predictive factors will be apparent to those skilled in the
art.

The above briefly described embodiment is only one of several preferred
embodiments described herein, and should not be interpreted as representing the invention as
a whole, or as the "thrust" of the invention. Descriptions of other, equally important,
embodiments have been omitted from this Summary merely for conciseness. Of particular
note is the fact that the described method is applicable to any time series data, not just FX
data.

BRIEF DESCRIPTION OF THE DRAWINGS 
FIG. 1 is a graph of the foreign exchange rate for USD/CHF for the week from
Sunday, October 26, to Sunday, November 2, 1997.

FIG. 2 is a graph of a kernel ma[τ, n](t) for n = 1, 2, 4, 8, and 16, where τ = 1.

FIG. 3 is a graph on a logarithmic scale of the kernel ma[τ, n](t) for n = 1, 2, 4, 8, and
16, where τ = 1.

FIG. 4 is a graph of a schematic differential kernel.

FIG. 5 is a graph of a differential operator Δ[τ], for τ = 1.

FIG. 6 is a graph on a logarithmic scale of the absolute value of the differential
operator Δ[τ], for τ = 1.

FIG. 7 illustrates a comparison between the differential computed using the formula
Δ[τ] = γ (EMA[ατ, 1] + EMA[ατ, 2] − 2 EMA[αβτ, 4]),
with τ = 24 hours ("24h"), and the point-wise return x(t) − x(t − 24h).

FIG. 8 is a graph of an annualized derivative D[τ, γ = 0.5; x] for USD/CHF from 1
Jan 1988 to 1 Nov 1998.

FIG. 9 shows an annualized volatility computed as MNorm[τ/2; D[τ/32, γ = 0.5; x]]
with τ = 1h.

FIG. 10 shows plots of a standardized return, a moving skewness, and a moving
kurtosis.

FIG. 11 plots a kernel wf(t) for a windowed Fourier operator, for n = 8 and k = 6.

FIG. 12 shows a plot of a normed windowed Fourier transform for the example week,
with τ = 1 hour, k = 6, and n = 8.

FIG. 13 illustrates major steps of a preferred embodiment.

FIG. 14 illustrates major steps of a second preferred embodiment.

FIG. 15 illustrates major steps of a third preferred embodiment.

FIG. 16 illustrates major steps of a fourth preferred embodiment.



DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

1 Introduction



The generalization to inhomogeneous time series introduces a number of technical
peculiarities. Because of their time-translation invariance, all macroscopic operators can be
represented by convolutions. A convolution is defined as an integral, so the series should be
defined in continuous time. Actual data is known only at discrete sampling times, so some
interpolation needs to be used in order to properly define the convolution integral. The same
problem is present when constructing an artificial homogeneous time series from
inhomogeneous data. Another technical peculiarity originates from the fact that our
macroscopic operators are ultimately composed of iterated moving averages. All such EMA
operators have non-compact kernels: the kernels decay exponentially fast, but strictly
speaking they remain positive for all finite times. This implies an infinite memory; a build-up
must be done over an initialization period before the value of an operator becomes meaningful.
All the above points are discussed in detail below.

High-frequency data in finance has a property that creates another technical difficulty:
strong intra-day and intra-week seasonalities, due to the daily and weekly pattern of human
activities. A powerful deseasonalization technique is needed, such as a transformed business
time scale (see Dacorogna, M.M., Müller, U.A., Nagler, R.J., Olsen, R.B., and Pictet, O.V.,
1993, A geographical model for the daily and weekly seasonal volatility in the FX market,
Journal of International Money and Finance, 12(4), 413-438) (hereinafter Dacorogna et al.,
1993). Essentially, this scale is a continuous-time generalization of the familiar daily
business time scale (which contains five days per week, Saturdays and Sundays omitted). A
continuous business time scale ϑ allows us to map a time interval dt in physical time to an
interval dϑ in business time, where dϑ/dt is proportional to the expected market activity. All
the techniques presented in this description can be based on any business time scale. The
required modification is to replace physical time intervals with corresponding business time
intervals. As this extension is straightforward to those skilled in the art, all the formulae are
given in physical time and a few remarks on scaled time are made when the extension or its
consequences are nonobvious.

The plan of this description is as follows: The notation is fixed in Section 2 and the
main theoretical considerations are given in Section 3. A set of convenient macroscopic
operators, including different moving averages and derivatives, is given in Section 4. Armed
with powerful basic operators, it is then easy to introduce novel methods of calculating time
series predictive factors such as moving volatility, correlation, moving skewness and kurtosis,
and to generalize the described methods to complex-valued operators. In Section 5, we
describe preferred implementations of the method.

Examples are given with data taken from the foreign exchange (FX) market. When
not specified, the data set is USD/CHF for the week from Sunday, October 26, 1997 to
Sunday, November 2. This week has been selected because on Tuesday, October 28, 1997, some
Asian stock markets crashed, causing turbulence in many markets around the world,
including the FX market. Yet, the relation between a stock market crash originating in Asia
and the USD/CHF foreign exchange rate is quite indirect, making this example interesting.
The prices of USD/CHF for the example week are plotted in FIG. 1. All the figures for this
week have been computed using high-frequency data; the results have finally been sampled
each hour using a linear interpolation scheme. The computations have been done in physical
time, therefore exhibiting the full daily and weekly seasonalities contained in the data. FIG.
1 shows the FX rate of USD/CHF (U.S. dollars to Swiss Francs) for the week of Sunday,
October 26 to Sunday, November 2, 1997. On the time axis, the labels correspond to the day
in October, with the points 32 and 33 corresponding to November 1 and 2. From the market
quote containing bid and ask prices, the (geometric) middle price was computed as
$\sqrt{\mathrm{bid} \cdot \mathrm{ask}}$.

Finally, we want to emphasize that the techniques described herein can be applied to a
wide range of statistical computations in finance, for example the analysis needed in risk
management. A well-known application can be found in (Pictet O.V., Dacorogna M.M.,
Müller U.A., Olsen R.B., and Ward J.R., 1992, Real-time trading models for foreign
exchange rates, Neural Network World, 2(6), 713-744) (hereinafter Pictet et al., 1992).
Further, these techniques can also be applied to any time series data (e.g., commodity prices
or temperature data), not just financial data.

2 Notation and Mathematical Preliminaries 

The letter z is used herein to represent a generic time series. The elements, or ticks,
$(t_i, z_i)$ of a time series z consist of a time $t_i$ and a scalar value $z_i$. The generalization to
multivariate inhomogeneous time series is straightforward (except for the business time scale
aspect). The value $z_i = z(t_i)$ and the time point $t_i$ constitute the i-th element of the time series
z. The sequence of sampling (or arrival) times is required to be growing:
$t_i > t_{i-1}$. The strict inequality is required in a true univariate time series and is theoretically
always true if the information arrives through one channel. In practice, the arrival time is
only known with a finite precision (say, one second), and two ticks may well have the same
arrival time. Yet, for most of the methods described herein, the strict monotonicity of the
time process is not required. A general time series is inhomogeneous, meaning that the
sampling times are irregular. For a homogeneous time series, the sampling times are
regularly spaced: $t_i - t_{i-1} = \delta t$ = constant. If a time series depends on some parameters θ, these
are made explicit between square brackets, z[θ].

An operator Ω from the space of time series into itself is denoted by Ω[z]. The
operator may depend on some parameters θ, written Ω[θ; z]. The value of Ω[z] at time t is
Ω[z](t). For linear operators, a product notation Ωz is also used. The average over a whole
time series of length T is denoted by $E[z] := \frac{1}{T} \int dt\, z(t)$. The probability density function
(pdf) of z is denoted p(z). A synthetic regular (or homogeneous) time series (RTS), spaced by
δt, derived from the irregular time series z, is denoted RTS[δt; z]. A standardized time series
for z is denoted $\hat{z} = (z - E[z])/\sigma[z]$, where $\sigma[z]^2 = E[(z - E[z])^2]$.

The letter x is used to represent the logarithmic middle price time series
$x = (\ln p_{\mathrm{bid}} + \ln p_{\mathrm{ask}})/2 = \ln \sqrt{p_{\mathrm{bid}}\, p_{\mathrm{ask}}}$.
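The definition of the logarithmic middle price can be illustrated with a minimal sketch. The function name `log_mid_price` and the sample quote are assumptions made for illustration only; they do not appear in the description.

```python
import math


def log_mid_price(bid: float, ask: float) -> float:
    """Logarithmic middle price x = (ln bid + ln ask) / 2 = ln sqrt(bid * ask)."""
    if bid <= 0 or ask <= 0:
        raise ValueError("bid and ask prices must be positive")
    return 0.5 * (math.log(bid) + math.log(ask))


# Hypothetical USD/CHF quote, used only to exercise the function.
x = log_mid_price(1.4000, 1.4010)
```

Working with the logarithm makes differences of x directly interpretable as (logarithmic) returns.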
3 Convolution Operators: General Considerations



3.1 Linear operators 

If an operator is linear, time-translation invariant and causal, it can be represented by a
convolution with a kernel ω(t):

$$\Omega[z](t) = \int_{-\infty}^{t} dt'\, \omega(t - t')\, z(t'). \tag{1}$$

The kernel ω(t) is defined only on the positive semi-axis t ≥ 0, and should decay for t large
enough. With this convention for the convolution, the weight given to past events
corresponds to the value of the kernel for positive argument. The value of the kernel
ω(t − t') is the weight of events in the past, at a time interval t − t' from t. In this convolution,
z(t') is a continuous function of time. Actual time series z are known only at the sampling
times $t_i$ and should be interpolated between sampling points. Many interpolation procedures
for the value of z(t) between $t_{i-1}$ and $t_i$ can be defined, but three are used in practice: previous
value $z(t) = z_{i-1}$, next value $z(t) = z_i$, and linear interpolation $z(t) = \alpha(t) z_{i-1} + [1 - \alpha(t)] z_i$ with
$\alpha(t) = (t_i - t)/(t_i - t_{i-1})$.

The linear interpolation leads to a continuous interpolated function. Moreover, linear
interpolation defines the mean path of a random walk, given the start and end values.
Unfortunately, it is non-causal, because in the interval between $t_{i-1}$ and $t_i$, the value at the end
of the interval $z_i$ is used. Only the previous-value interpolation is causal, as only the
information known at $t_{i-1}$ is used in the interval between $t_{i-1}$ and $t_i$. Any interpolation can be
used for historical computations, but for the real-time situation, only the causal
previous-value interpolation is defined. In practice, the interpolation scheme is almost
irrelevant for good macroscopic operators, i.e., if the kernel has a range longer than the
typical sampling interval.
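The three interpolation schemes can be sketched as follows. This is an illustrative helper, not part of the described method: the function name `interpolate`, the scheme labels, and the convention of returning the tick value when t coincides with a sampling time are assumptions.

```python
from bisect import bisect_right


def interpolate(times, values, t, scheme="previous"):
    """Value z(t) between ticks under the previous, next, or linear scheme.

    `times` must be sorted in ascending order and t must lie inside
    [times[0], times[-1]].
    """
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the sampled interval")
    j = bisect_right(times, t)          # times[j-1] <= t < times[j]
    if j == len(times) or times[j - 1] == t:
        return values[j - 1]            # t coincides with a sampling time
    t_prev, t_next = times[j - 1], times[j]
    z_prev, z_next = values[j - 1], values[j]
    if scheme == "previous":            # causal: uses only past information
        return z_prev
    if scheme == "next":                # non-causal
        return z_next
    if scheme == "linear":              # non-causal, continuous
        a = (t_next - t) / (t_next - t_prev)
        return a * z_prev + (1.0 - a) * z_next
    raise ValueError(f"unknown scheme: {scheme}")
```

For example, with ticks at times 0 and 10 carrying values 1 and 3, the value at t = 2.5 is 1 under the previous-value scheme, 3 under the next-value scheme, and 1.5 under linear interpolation.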

The kernel ω(t) can be extended to all t ∈ ℝ, with ω(t) = 0 for t < 0. This is useful
for analytical computation, particularly when the order of integral evaluations has to be
changed. If the operator Ω is linear and time-translation invariant but non-causal, the same
representation can be used except that the kernel may be non-zero on the whole time axis.

Two broad families of operators that share general shapes and properties are often
used. An average operator has a kernel that is non-negative, ω(t) ≥ 0, and normalized to
unity, $\int dt\, \omega(t) = 1$. This implies that Ω[parameters; Const] = Const. Derivative and
difference operators have kernels that measure the difference between a value now and a
value in the past (with a typical lag of τ). Their kernels have a zero average, $\int dt\, \omega(t) = 0$,
such that Ω[parameters; Const] = 0.

The integral (1) can also be evaluated in scaled time. In this case, the kernel is no
longer invariant with respect to physical time translation (i.e., it depends on t and t') but it is
invariant with respect to translation in business time. If the operator is an average or a
derivative, the normalization property is preserved in scaled time.

3.2 Range and width 

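The n-th moment of a causal kernel ω is defined as

$$\langle t^n \rangle_\omega = \int_0^\infty dt\, \omega(t)\, t^n. \tag{2}$$

The range r and the width w of an operator Ω are defined respectively by the following
relations:

$$r[\Omega] = \langle t \rangle_\omega = \int_0^\infty dt\, \omega(t)\, t, \qquad
w^2[\Omega] = \langle (t - r)^2 \rangle_\omega = \int_0^\infty dt\, \omega(t)\, (t - r)^2. \tag{3}$$

For most operators Ω[τ] depending on a time range τ, the formula is set up so that the range
equals τ, i.e., $|r[\Omega[\tau]]| = \tau$.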

3.3 Convolution of kernels

A standard step is to successively apply two linear operators:

$$\Omega_c[z] = \Omega_2 \circ \Omega_1[z] = \Omega_2 \Omega_1 z := \Omega_2[\Omega_1[z]].$$

It is easy to show that the kernel of $\Omega_c$ is given by the convolution of the kernels of $\Omega_1$ and
$\Omega_2$:

$$\omega_c = \omega_1 \star \omega_2 \quad \text{or} \quad
\omega_c(t - t') = \int_{-\infty}^{\infty} dt''\, \omega_1(t - t'')\, \omega_2(t'' - t') \tag{4}$$

or, for causal operators,

$$\omega_c(t) = \int_{-t/2}^{t/2} dt'\, \omega_1\!\left(\tfrac{t}{2} - t'\right) \omega_2\!\left(t' + \tfrac{t}{2}\right)
\quad \text{for } t \ge 0, \tag{5}$$

and $\omega_c(t) = 0$ for t < 0. Under convolution, range, width, and second moment obey the
following simple laws:

$$r_c = r_1 + r_2, \qquad
w_c^2 = w_1^2 + w_2^2, \qquad
\langle t^2 \rangle_c = \langle t^2 \rangle_1 + \langle t^2 \rangle_2 + 2 r_1 r_2. \tag{6}$$
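The additivity of range and squared width under convolution can be checked numerically. The following is a sketch under stated assumptions: two exponential kernels with ranges 1 and 2 are sampled on a uniform grid and normalized to unit sum, and NumPy's discrete convolution stands in for the integral of eq. (4). The grid step, kernel ranges, and helper name `range_and_width2` are illustrative choices.

```python
import numpy as np


def range_and_width2(weights, dt):
    """Range r = <t> and squared width w^2 = <(t - r)^2> of a discretized kernel.

    `weights` must sum to one and be sampled on the grid 0, dt, 2*dt, ...
    """
    t = dt * np.arange(len(weights))
    r = float(np.sum(weights * t))
    w2 = float(np.sum(weights * (t - r) ** 2))
    return r, w2


dt = 0.01
t = dt * np.arange(6000)

# Two exponential kernels (ranges close to 1 and 2), normalized on the grid.
k1 = np.exp(-t / 1.0)
k1 /= k1.sum()
k2 = np.exp(-t / 2.0)
k2 /= k2.sum()

# Kernel of the composed operator: discrete counterpart of eq. (4).
kc = np.convolve(k1, k2)

r1, w2_1 = range_and_width2(k1, dt)
r2, w2_2 = range_and_width2(k2, dt)
rc, w2_c = range_and_width2(kc, dt)
# rc agrees with r1 + r2 and w2_c with w2_1 + w2_2 up to rounding error.
```

Because the discretized kernels are normalized, the two laws hold on the grid up to floating-point rounding, whereas the individual ranges differ from 1 and 2 by a discretization error of order dt.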



3.4 Build-up time interval 

Since the basic building blocks of a preferred embodiment are EMA operators, most
kernels have an exponential tail for large t. This implies that, when starting the evaluation of
an operator at time T, a build-up time interval must elapse before the result of the evaluation
is "meaningful" (i.e., the initial conditions at T are sufficiently forgotten). This heuristic
statement can be expressed by quantitative definitions. We assume that the process z(t) is
known since time T, and is modeled before T as an unknown random walk with no drift. The
definition (1) for an operator Ω, computed since T, needs to be modified in the following way:

$$\Omega[T; z](t) = \int_{T}^{t} dt'\, \omega(t - t')\, z(t'). \tag{7}$$

The "infinite" build-up corresponds to Ω[−∞; z](t). For −T < 0, the average build-up
error ε at t = 0 is given by

$$\varepsilon^2 = E\!\left[\left(\Omega[-T; z](0) - \Omega[-\infty; z](0)\right)^2\right]
= E\!\left[\left(\int_{-\infty}^{-T} dt'\, \omega(-t')\, z(t')\right)^2\right] \tag{8}$$

where the expectation E[·] is an average on the space of processes z. For a given build-up
error ε, this equation is the implicit definition of the build-up time interval T. In order to
compute the expectation, we need to specify the considered space of random processes. We
assume simple random walks with constant volatility σ, namely
$$E\!\left[(z(t) - z(t + \delta t))^2\right] = \sigma^2\, \frac{\delta t}{1\mathrm{y}}. \tag{9}$$

The symbol 1y denotes one year, so δt/1y is the length of δt expressed in years. With this
choice of units, σ is an annualized volatility, with values roughly from 1% (for bonds) to 50%
(for stocks), and a typical value of 10% for foreign exchange. For t < −T, t' < −T, we have the
expectation

$$E[z(t)\, z(t')] = z(-T)^2 + \sigma^2 \min\!\left(\frac{-t - T}{1\mathrm{y}}, \frac{-t' - T}{1\mathrm{y}}\right). \tag{10}$$

Having defined the space of processes, a short computation gives

$$\varepsilon^2 = z(-T)^2 \left(\int_T^\infty dt\, \omega(t)\right)^2
+ 2\sigma^2 \int_T^\infty dt\, \omega(t) \int_T^t dt'\, \omega(t')\, \frac{t' - T}{1\mathrm{y}}. \tag{11}$$

The first term is the "error at initialization," corresponding to the decay of the initial value
Ω[−T](−T) = 0 in the definition (7). A better initialization is
$\Omega[-T](-T) = z(-T) \int_0^\infty dt\, \omega(t)$, corresponding to a modified definition for Ω[−T](t):

$$\Omega[-T; z](t) = z(-T) \int_{-\infty}^{-T} dt'\, \omega(t - t') + \int_{-T}^{t} dt'\, \omega(t - t')\, z(t'). \tag{12}$$

Another interpretation for the above formula is that z is approximated by its most
probable value z(−T) for t < −T. With this better definition for Ω, the error reduces to

$$\varepsilon^2 = 2\sigma^2 \int_T^\infty dt\, \omega(t) \int_T^t dt'\, \omega(t')\, \frac{t' - T}{1\mathrm{y}}. \tag{13}$$

For a given kernel ω, volatility σ and error ε, eq. (13) is an equation for T. Most of the
kernels introduced in the next section have the scaling form $\omega(\tau, t) = \tilde{\omega}(t/\tau)/\tau$. In this case,
the equation for $\tilde{T} = T/\tau$ reduces to

$$\varepsilon^2 = 2\sigma^2\, \frac{\tau}{1\mathrm{y}} \int_{\tilde{T}}^\infty dt\, \tilde{\omega}(t) \int_{\tilde{T}}^{t} dt'\, \tilde{\omega}(t')\, (t' - \tilde{T}). \tag{14}$$

Since this equation cannot be solved for general operators, the build-up interval should be
computed numerically. This equation can be solved analytically for the simple EMA kernel,
and gives the solution for the build-up time:

$$\frac{T}{\tau} = -\ln \varepsilon + \frac{1}{2} \ln\!\left(\frac{\sigma^2 \tau}{2 \cdot 1\mathrm{y}}\right). \tag{15}$$

As expected, the build-up time interval is large for a small error tolerance and for processes
with high volatility.
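The closed-form build-up time for the simple EMA can be sketched in a few lines. The sketch assumes that the variance of the random walk over an interval δt is σ² δt/1y, with σ the annualized volatility, so that for the kernel exp(−t/τ)/τ the build-up error is ε = σ sqrt(τ / (2 · 1y)) exp(−T/τ). The function names, the choice of years as the time unit, and the clamping of negative results to zero are assumptions made for illustration.

```python
import math


def ema_buildup_error(tau_years: float, buildup_years: float, sigma: float) -> float:
    """Build-up error eps of a simple EMA after a build-up interval T (assumed model)."""
    return sigma * math.sqrt(tau_years / 2.0) * math.exp(-buildup_years / tau_years)


def ema_buildup_time(tau_years: float, eps: float, sigma: float) -> float:
    """Build-up interval T (in years) needed to reach the error tolerance eps.

    Solves T / tau = -ln(eps) + 0.5 * ln(sigma^2 * tau / (2 * 1y)); a negative
    result means the tolerance is already met and is clamped to zero.
    """
    if tau_years <= 0 or eps <= 0 or sigma <= 0:
        raise ValueError("tau_years, eps and sigma must be positive")
    ratio = -math.log(eps) + 0.5 * math.log(sigma ** 2 * tau_years / 2.0)
    return max(0.0, ratio) * tau_years


# Hypothetical setting: EMA range of one day, 10% annualized volatility,
# tolerated error of 1e-4 on the logarithmic price.
tau = 1.0 / 365.25
T = ema_buildup_time(tau, eps=1e-4, sigma=0.10)
```

In this hypothetical setting the build-up interval is a few multiples of τ, and a tighter tolerance or a higher volatility lengthens it.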

For operators more complicated than the simple EMA, eq. (14) is, in general, not
solvable analytically. A simple rule of thumb can be given: the fatter the tail of the kernel,
the longer the required build-up. A simple measure for the tail can be constructed from the
first two moments of the kernel as defined by eq. (2). The aspect ratio AR[Ω] is defined as

$$\mathrm{AR}[\Omega] = \frac{\langle t^2 \rangle_\omega^{1/2}}{\langle t \rangle_\omega}. \tag{16}$$

Both $\langle t \rangle$ and $\sqrt{\langle t^2 \rangle}$ measure the extension of the kernel and are usually proportional to τ; thus
the aspect ratio is independent of τ (the "width" of the moving "window" of data over which
the EMA is "averaged") and dependent only on the shape of the kernel, in particular its tail
property. Typical values of this aspect ratio are $2/\sqrt{3}$ for a rectangular kernel and $\sqrt{2}$ for a
simple EMA. A low aspect ratio means that the kernel of the operator has a short tail and
therefore a short build-up time interval in terms of τ. This is a good rule for non-negative
causal kernels; the aspect ratio is less useful for choosing the build-up interval of causal
kernels with more complicated, partially negative shapes.
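The two typical aspect ratios quoted above can be reproduced numerically. In this sketch the kernels are sampled at the midpoints of a uniform grid and renormalized on that grid; the grid parameters and the helper name `aspect_ratio` are illustrative assumptions.

```python
import numpy as np


def aspect_ratio(kernel, t):
    """Aspect ratio sqrt(<t^2>) / <t> of a non-negative kernel sampled on a uniform grid."""
    dt = t[1] - t[0]
    k = kernel / (np.sum(kernel) * dt)       # renormalize on the grid
    m1 = np.sum(k * t) * dt                  # first moment <t>
    m2 = np.sum(k * t ** 2) * dt             # second moment <t^2>
    return float(np.sqrt(m2) / m1)


dt = 1e-3
t = (np.arange(50_000) + 0.5) * dt           # midpoints on [0, 50]
tau = 1.0

ema_kernel = np.exp(-t / tau) / tau                              # simple EMA, range tau
rect_kernel = np.where(t < 2.0 * tau, 1.0 / (2.0 * tau), 0.0)    # rectangle, range tau

ar_ema = aspect_ratio(ema_kernel, t)         # close to sqrt(2)
ar_rect = aspect_ratio(rect_kernel, t)       # close to 2 / sqrt(3)
```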



3.5 Homogeneous operators 

There are many more ways to build non-linear operators; an example is given in
Section 4.8 for the (moving) correlation. In practice, most non-linear operators are
homogeneous of degree p, namely $\Omega[ax] = |a|^p\, \Omega[x]$ (here the word "homogeneous" is used in
a sense different from that in the term "homogeneous time series"). Translation-invariant
homogeneous operators of degree pq take the simple form of a convolution:

$$\Omega[z](t) = \left[\int_{-\infty}^{t} dt'\, \omega(t - t')\, |z(t')|^p\right]^q \tag{17}$$

for some exponents p and q. An example is the moving norm (see Section 4.4) with ω
corresponding to an average and q = 1/p.

3.6 Robustness

Data errors (outliers) should be filtered prior to any computation. Outlier filtering is
difficult and sometimes arbitrary for high-frequency data in finance; this data is stochastic
with a fat-tailed distribution of price changes (see Pictet O.V., Dacorogna M.M., and Müller
U.A., Hill, bootstrap and jackknife estimators for heavy tails, in "A practical guide to heavy
tails: Statistical Techniques for Analysing Heavy Tailed Distributions," edited by Robert J.
Adler, Raisa E. Feldman and Murad S. Taqqu, published by Birkhäuser, Boston 1998)
(hereinafter Pictet et al., 1998). Sometimes it is desirable to build robust estimators to reduce
the impact of outliers and the choice of the filtering algorithm. The problem is acute mainly
when working with returns, for example when estimating a volatility, because the difference
operator needed to compute the return r from the price x is sensitive to outliers. The
following modified operator achieves robustness by giving a higher weight to the center of
the distribution of returns r than to the tails:

$$\Omega[f; r] = f^{-1}\{\Omega[f(r)]\} \tag{18}$$

where f is an odd function over ℝ. Possible mapping functions f(x) are

$$\mathrm{sign}(x)\,|x|^\gamma = x\,|x|^{\gamma - 1}, \tag{19}$$

$$\mathrm{sign}(x) \quad \text{(this corresponds to } \gamma \to 0 \text{ in the above formula)}, \tag{20}$$

$$\tanh(x/x_0). \tag{21}$$

Robust operator mapping functions defined by eq. (19) have an exponent 0 ≤ γ < 1. In
some special applications, operators with γ > 1, emphasizing the tail of the distribution, may
also be used. In the context of volatility estimates, the usual $L^2$ volatility operator based on
squared returns can be made more robust by using the mapping function $f(x) = \mathrm{sign}(x)\sqrt{|x|}$ (the
signed square root); the resulting volatility is then based on absolute returns as in eq. (39).
More generally, the signed power $f(x) = \mathrm{sign}(x)\,|x|^p$ transforms an $L^2$ volatility into an $L^{2p}$
volatility. This simple power law transformation is often used and therefore included in the
definition of the moving norm, moving variance, or volatility operators, eq. (32). Yet, as will
be apparent to those skilled in the art, more general transformations can also be used.
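The robust construction of eq. (18) with the signed-power mapping of eq. (19) can be sketched as follows. For illustration, a plain sample mean stands in for the operator Ω; the function names and the sample returns (including the artificial outlier) are assumptions and not data from the description.

```python
import math


def signed_power(x: float, gamma: float) -> float:
    """Mapping f(x) = sign(x) * |x|**gamma of eq. (19)."""
    return math.copysign(abs(x) ** gamma, x)


def signed_power_inverse(y: float, gamma: float) -> float:
    """Inverse mapping f^{-1}(y) = sign(y) * |y|**(1/gamma), valid for gamma > 0."""
    return math.copysign(abs(y) ** (1.0 / gamma), y)


def robust_apply(operator, values, gamma):
    """Robust operator f^{-1}{ operator[f(values)] } in the sense of eq. (18)."""
    mapped = [signed_power(v, gamma) for v in values]
    return signed_power_inverse(operator(mapped), gamma)


def mean(values):
    """Plain average, used here only as a stand-in for an averaging operator."""
    return sum(values) / len(values)


# Hypothetical returns; the last value plays the role of an outlier.
returns = [0.01, -0.01, 0.02, 1.0]
plain = mean(returns)
robust = robust_apply(mean, returns, gamma=0.5)
```

With γ = 0.5 the outlier contributes far less to the result than it does to the plain average, which is the intended effect of weighting the center of the distribution more heavily than the tails.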

4 The menagerie of convolution operators
4.1 Exponential moving average EMA[τ]

The basic exponential moving average (EMA) is a simple average operator, with an
exponentially decaying kernel:

$$\mathrm{ema}(t) = \frac{e^{-t/\tau}}{\tau}. \tag{22}$$

This EMA operator is our foundation. Its computation is very efficient, and other more
complex operators can be built with it, such as MAs, differentials, derivatives and volatilities.
The numerical evaluation is efficient because of the exponential form of the kernel, which
leads to a simple iterative formula:

$$\mathrm{EMA}[\tau; z](t_n) = \mu\, \mathrm{EMA}[\tau; z](t_{n-1}) + (\nu - \mu)\, z_{n-1} + (1 - \nu)\, z_n,
\quad \text{with} \quad \alpha = \frac{t_n - t_{n-1}}{\tau}, \quad \mu = e^{-\alpha}, \tag{23}$$

and where ν depends on the chosen interpolation scheme,

$$\nu = \begin{cases}
1 & \text{previous point} \\
(1 - \mu)/\alpha & \text{linear interpolation} \\
\mu & \text{next point}
\end{cases} \tag{24}$$

Thanks to this iterative formula, the convolution never needs to be computed in practice; only
a few multiplications and additions have to be done for each tick. In Section 4.10, the EMA
operator is extended to the case of complex kernels.
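The iterative formula can be sketched in code. The update rule and the three values of ν follow eqs. (23) and (24); the function names, the scheme labels, and the initialization of the EMA with the first tick value (i.e., no build-up handling) are assumptions made for this sketch.

```python
import math


def ema_update(prev_ema, z_prev, z_new, dt, tau, scheme="linear"):
    """One step of the EMA recursion for a tick arriving dt after the previous one."""
    alpha = dt / tau
    mu = math.exp(-alpha)
    if scheme == "previous":
        nu = 1.0
    elif scheme == "linear":
        # (1 - mu) / alpha tends to 1 as alpha -> 0 (ticks with equal time stamps).
        nu = (1.0 - mu) / alpha if alpha > 0 else 1.0
    elif scheme == "next":
        nu = mu
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return mu * prev_ema + (nu - mu) * z_prev + (1.0 - nu) * z_new


def ema(times, values, tau, scheme="linear"):
    """EMA[tau; z] evaluated at every tick of an inhomogeneous series."""
    out = [values[0]]                      # assumed initialization: first tick value
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        out.append(ema_update(out[-1], values[i - 1], values[i], dt, tau, scheme))
    return out
```

Since the three coefficients μ, ν − μ and 1 − ν sum to one, a constant series is left unchanged, as expected for an average operator. Each tick costs one exponential and a handful of multiplications and additions, independent of the length of the history.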

4.2 The iterated EMA[τ, n]

The basic EMA operator can be iterated to provide a family of iterated exponential
moving average operators EMA[τ, n]. A simple recursive definition is

$$\mathrm{EMA}[\tau, n; z] = \mathrm{EMA}[\tau; \mathrm{EMA}[\tau, n - 1; z]] \tag{25}$$

with EMA[τ, 1; z] = EMA[τ; z]. This definition can be efficiently evaluated by using the
iterative formula (23) for all these basic EMAs. There is a non-obvious complication related
to the choice of the interpolation scheme (24). The EMA of z necessarily has an interpolation
scheme different from that used for z. The correct form of EMA[τ; z] between two points is
no longer a straight line but a non-linear (exponential) curve. It will be straightforward to
those skilled in the art to derive the corresponding exact interpolation formula. When one of
the interpolation schemes of eq. (24) is used after the first iteration, a small error is made.
Yet, if the kernel is wide as compared to $t_n - t_{n-1}$, this error is indeed very small. As a suitable
approximation, a preferred embodiment uses linear interpolation in the second and all further
EMA iterations, even if the first iteration was based on the next-point interpolation. The only
exception occurs if $z_n$ is not yet known; then we need a causal operator based on the
previous-point interpolation.



The kernel of EMA[τ, n] is

$$\mathrm{ema}[\tau, n](t) = \frac{1}{(n - 1)!} \left(\frac{t}{\tau}\right)^{n - 1} \frac{e^{-t/\tau}}{\tau}. \tag{26}$$

This family of functions is related to Laguerre polynomials, which are orthogonal with
respect to the measure $e^{-t}$ (for τ = 1). Through an expansion in Laguerre polynomials, any
kernel can be expressed as a sum of iterated EMA kernels. Therefore, the convolution with
an arbitrary kernel can be evaluated by iterated exponential moving averages. Yet, the
convergence of this expansion may be slow, namely high-order iterated EMAs may be
necessary, possibly with very large coefficients. This typically happens if one tries to
construct operators that have a decay other (faster) than exponential. Therefore, in practice,
we construct operators "empirically" from a few low-order EMAs, in a way to minimize the
build-up time. The set of operators provided by this description covers a wide range of
computations needed in finance.

The range, second moment, width, and aspect ratio of the iterated EMA are,
respectively,

$$r = n\tau, \qquad
\langle t^2 \rangle = n(n + 1)\,\tau^2, \qquad
w^2 = n\tau^2, \qquad
\mathrm{AR} = \sqrt{(n + 1)/n}. \tag{27}$$
The iterated EMA[τ, n] operators with large n have a shorter, more compact kernel
and require a shorter build-up time interval than a simple EMA of the same range nτ. This is
indicated by the fact that the aspect ratio AR decreases toward 1 for large n. Each basic EMA
operator that is part of the iterated EMA has a range τ which is much shorter than the range nτ
of the full kernel. Even if the tail of the kernel is still exponential, it decays faster due to the
small basic EMA range τ.

In order to further improve our preferred method, we build another type of compact
kernel by combining iterated EMAs, as shown in the next section. Like the iterated EMAs,
these combined iterated EMAs have a shorter build-up time interval than a simple EMA of
the same range.



4.3 Moving average MA[τ, n]

A very convenient moving average operator is provided by

$$\mathrm{MA}[\tau, n] = \frac{1}{n}\sum_{k=1}^{n}\mathrm{EMA}[\tau', k], \quad \text{with } \tau' = \frac{2\tau}{n+1}.$$ (28)

The parameter τ' is chosen so that the range of MA[τ, n] is r = τ, independent of n. This
provides a family of more rectangular-shaped kernels, with the relative weight of the distant
past controlled by n. Kernels for different values of n are shown in FIG. 2, where
ma[τ, n](t) is plotted for n = 1, 2, 4, 8, and 16, with τ = 1. The kernels' analytical form is
given by

$$\mathrm{ma}[\tau, n](t) = \frac{n+1}{n}\,\frac{e^{-t/\tau'}}{2\tau}\sum_{k=0}^{n-1}\frac{1}{k!}\left(\frac{t}{\tau'}\right)^{k}.$$ (29)

For n = ∞, the sum corresponds to the Taylor expansion of exp(t/τ'), which cancels the
term exp(-t/τ') in (29), making the kernel constant. For finite n, when t/τ' is small enough,
the finite sum will be a very good approximation of exp(t/τ'). "Small enough" means that
the largest term in the sum is of order one: (t/τ')^n/n! ~ 1. For large n, the condition (t/τ')^n/n!
~ 1 corresponds to t ~ 2τ (using Stirling's approximation n! ~ n^n). Therefore, for t << 2τ, the
series approximates well the Taylor expansion of an exponential:

$$\sum_{k=0}^{n-1}\frac{1}{k!}\left(\frac{t}{\tau'}\right)^{k} \simeq e^{t/\tau'}, \qquad \mathrm{ma}[\tau, n](t) \simeq \frac{n+1}{n}\,\frac{1}{2\tau}.$$

This explains the constant behavior of the kernel for t << 2τ. For t > 2τ large, the
exponential always dominates and the kernel decays to zero. Therefore, for large n, this
operator tends to a rectangular moving average for which AR = 2/√3. For values of n > 5, the
kernel is rectangular-like more than EMA-like; this can be seen in FIG. 2.
The decay of MA kernels is shown in FIG. 3, where ma[τ, n](t) is plotted on a logarithmic
scale, for n = 1, 2, 4, 8, and 16, with τ = 1. The aspect ratio of the MA operator is

$$AR = \sqrt{\frac{4(n+2)}{3(n+1)}}.$$ (30)

Clearly, the larger n, the shorter the build-up.
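As a hedged illustration of eqs. (28) to (30), the sketch below evaluates the MA kernel of eq. (29) and checks numerically that its total weight is one, that its range equals τ for every n, and that its aspect ratio follows eq. (30). The helper names, the midpoint integration, and the cut-off at 40τ are assumptions made for this example only.

```python
import math

def ma_kernel(t, tau, n):
    """Kernel ma[tau, n](t) of eq. (29), with tau' = 2*tau/(n+1) as in eq. (28)."""
    tau_p = 2.0 * tau / (n + 1)
    x = t / tau_p
    term, partial_sum = 1.0, 0.0          # term holds x**k / k!
    for k in range(n):
        partial_sum += term
        term *= x / (k + 1)
    return (n + 1) / n * math.exp(-x) / (2.0 * tau) * partial_sum

def ma_moments(tau, n, steps=100_000):
    """Midpoint-rule estimates of total weight, range, and aspect ratio."""
    t_max = 40.0 * tau                    # illustrative cut-off
    dt = t_max / steps
    m0 = m1 = m2 = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        w = ma_kernel(t, tau, n) * dt
        m0 += w
        m1 += w * t
        m2 += w * t * t
    r = m1 / m0
    return m0, r, math.sqrt(m2 / m0) / r

for n in (1, 2, 4, 8, 16):
    weight, r, ar = ma_moments(1.0, n)
    print(n, round(weight, 5), round(r, 5), round(ar, 5),
          round(math.sqrt(4 * (n + 2) / (3 * (n + 1))), 5))
```

For n = 16 the kernel is already close to the plateau value (n + 1)/(2nτ) at t = τ/2, which is the rectangular-like behavior described above.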

This family of operators can be extended by "peeling off" some EMAs with small k:

$$\mathrm{MA}[\tau, n_{\inf}, n_{\sup}] = \frac{1}{n_{\sup} - n_{\inf} + 1}\sum_{k=n_{\inf}}^{n_{\sup}}\mathrm{EMA}[\tau', k], \quad \text{with } \tau' = \frac{2\tau}{n_{\sup} + n_{\inf}},$$

and with 1 ≤ n_inf ≤ n_sup. By choosing such a modified MA with n_inf > 1, we can generate a lag
operator with a kernel whose rectangular-like form starts after a lag rather than immediately.
This is useful in many applications that will be clear to those skilled in the art.

In almost every case, a moving average operator can be used instead of a sample
average. The sample average of z(t) is defined by

$$E[z] = \frac{1}{t_e - t_s}\int_{t_s}^{t_e} dt'\, z(t'),$$ (31)

where the dependency on start-time t_s and end-time t_e is implicit on the left-hand side. This
dependency can be made explicit with the notation E[t_e - t_s; z](t_e), thus demonstrating the
parallelism between the sample average and a moving average MA[2τ; z](t). The conceptual
difference is that when using a sample average, t_s and t_e are fixed, and the sample average is a
functional from the space of time series to R, whereas the MA operator produces another time
series. Keeping this difference in mind, we can replace the sample average E[·] by a moving
average MA[·]. For example, we can construct a standardized time series ẑ (as defined in
Section 2), a moving skewness, or a moving correlation (see the various definitions below).
Yet, sample averages and MAs can behave differently: for example, E[(z - E[z])^2] = E[z^2] -
E[z]^2, whereas MA[(z - MA[z])^2] ≠ MA[z^2] - MA[z]^2.

4.4 Moving norm, variance and standard deviation

With the efficient moving average operator, we define the moving norm, moving
variance, and moving standard deviation operators, respectively:

$$\begin{aligned}
\mathrm{MNorm}[\tau, p; z] &= \mathrm{MA}[\tau; |z|^p]^{1/p}, \\
\mathrm{MVar}[\tau, p; z] &= \mathrm{MA}[\tau; |z - \mathrm{MA}[\tau; z]|^p], \\
\mathrm{MSD}[\tau, p; z] &= \mathrm{MA}[\tau; |z - \mathrm{MA}[\tau; z]|^p]^{1/p}.
\end{aligned}$$ (32)

The norm and standard deviation are homogeneous of degree one with respect to z. The
p-moment μ_p is related to the norm by μ_p = MA[|z|^p] = MNorm[z]^p. Usually, p = 2 is taken.
Lower values for p provide a more robust estimate (see Section 3.6), and p = 1 is another
common choice. Even lower values can be used, for example p = 1/2.
In the formulae for MVar and MSD, there are two MA operators with the same range τ and
the same kernel. This choice is in line with standard practice: empirical means and variances
are computed for the same sample. Other choices can be interesting; for example, the
sample mean can be estimated with a longer time range.
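The operators of eq. (32) can be sketched for an irregularly spaced series as follows. This is only an illustration under stated assumptions: the EMA recursion written in `ema_series` is the form that eq. (23) is assumed to take with linear interpolation between ticks, the EMA is started at the first observed value (build-up effects are ignored), and all function names are hypothetical.

```python
import math

def ema_series(times, values, tau):
    """One EMA pass over an irregular series, linear interpolation between ticks.

    Assumed recursion (the form eq. (23) is taken to have):
      EMA(t_n) = mu*EMA(t_{n-1}) + (nu - mu)*z_{n-1} + (1 - nu)*z_n
    with alpha = (t_n - t_{n-1})/tau, mu = exp(-alpha), nu = (1 - mu)/alpha.
    """
    out = [values[0]]                      # assumption: start at the first value
    for i in range(1, len(times)):
        alpha = (times[i] - times[i - 1]) / tau
        mu = math.exp(-alpha)
        nu = (1.0 - mu) / alpha if alpha > 0.0 else 1.0
        out.append(mu * out[-1] + (nu - mu) * values[i - 1] + (1.0 - nu) * values[i])
    return out

def ma_series(times, values, tau, n=8):
    """MA[tau, n] of eq. (28): mean of the first n iterated EMAs of range tau'."""
    tau_p = 2.0 * tau / (n + 1)
    total = [0.0] * len(values)
    current = list(values)
    for _ in range(n):
        current = ema_series(times, current, tau_p)
        total = [a + c for a, c in zip(total, current)]
    return [a / n for a in total]

def mnorm(times, z, tau, p=2.0, n=8):
    """Moving norm: MA[|z|^p]^(1/p)."""
    return [m ** (1.0 / p) for m in ma_series(times, [abs(v) ** p for v in z], tau, n)]

def mvar(times, z, tau, p=2.0, n=8):
    """Moving variance: MA[|z - MA[z]|^p], same range for both MA operators."""
    mean = ma_series(times, z, tau, n)
    return ma_series(times, [abs(v - m) ** p for v, m in zip(z, mean)], tau, n)

def msd(times, z, tau, p=2.0, n=8):
    """Moving standard deviation: MVar^(1/p)."""
    return [v ** (1.0 / p) for v in mvar(times, z, tau, p, n)]

times = [0.0, 0.4, 1.1, 1.3, 2.9, 3.0, 4.2]
z = [1.0, -2.0, 0.5, 3.0, -1.0, 0.0, 2.0]
print(mnorm(times, z, tau=1.0))
print(msd(times, z, tau=1.0))
```

Only the current EMA values need to be carried from one tick to the next, so a streaming version would keep a fixed amount of state regardless of the number of ticks.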

4.5 Differential Δ[τ]

As mentioned above, a low-noise differential operator suitable to stochastic processes
should compute an "average differential," namely, the difference between an average around
time "now" over a time interval τ_1 and an average around time "now - τ" on a time interval
τ_2. The kernel may look like the schematic differential kernel plotted in FIG. 4.
Usually, τ, τ_1, and τ_2 are related and only the τ parameter appears, with τ_1 ~ τ_2 ~ τ/2.

The normalization for Δ is chosen so that Δ[τ; c] = 0 for a constant function c = c(t) =
constant, and Δ[τ; t] = τ. Note that our point of view is different from that used in
continuous-time stochastic analysis. In continuous time, the limit τ → 0 is taken, leading to
the Ito derivative with its subtleties. In our case, we keep the range τ finite in order to be able
to analyze a process at different time scales (i.e., for different orders of magnitude of τ).
Moreover, for financial data, the limit τ → 0 cannot be taken because a process is known only
on a discrete set of time points (and probably does not exist in continuous time).
The following operator can be selected as a suitable differential operator:

$$\Delta[\tau] = \gamma\,(\mathrm{EMA}[\alpha\tau, 1] + \mathrm{EMA}[\alpha\tau, 2] - 2\,\mathrm{EMA}[\alpha\beta\tau, 4])$$ (33)

with γ = 1.22208, β = 0.65 and α^{-1} = γ(8β - 3). This operator has a well-behaved kernel that
is plotted in FIG. 5, wherein the full line 510 is the graph of the kernel of the differential
operator Δ[τ], for τ = 1; the dotted curve 520 corresponds to the first two terms
γ(EMA[ατ, 1] + EMA[ατ, 2]); and the dashed curve 530 corresponds to the last term
2γ EMA[αβτ, 4]. The value of γ is fixed so that the integral of the kernel from the origin to
the first zero is one. The value of α is fixed by the normalization condition and the value of β
is chosen in order to get a short tail.

The tail can be seen in FIG. 6, which shows the kernel of the differential operator
Δ[τ], plotted in a logarithmic scale. The dotted line 610 shows a simple EMA with range τ,
demonstrating the much faster decay of the differential kernel. After t = 3.25τ, the kernel is
smaller than 10^{-3}, which translates into a small required build-up time of about 4τ.
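The normalization conditions stated for eq. (33) can be verified numerically on the kernel itself. In the sketch below, the iterated EMA kernel is taken in the standard form (t/τ)^(n-1) e^(-t/τ) / ((n-1)! τ), which is the form eq. (26) is assumed to have; the integration and root-finding helpers are illustrative. A kernel ω with zero total weight and first moment -τ gives Δ[τ; c] = 0 and Δ[τ; t] = τ.

```python
import math

GAMMA, BETA = 1.22208, 0.65
ALPHA = 1.0 / (GAMMA * (8.0 * BETA - 3.0))   # from alpha^-1 = gamma*(8*beta - 3)

def ema_kernel(t, tau, n):
    """Assumed form of the iterated EMA kernel."""
    return (t / tau) ** (n - 1) * math.exp(-t / tau) / (math.factorial(n - 1) * tau)

def delta_kernel(t, tau=1.0):
    """Kernel of the differential operator of eq. (33)."""
    return GAMMA * (ema_kernel(t, ALPHA * tau, 1) + ema_kernel(t, ALPHA * tau, 2)
                    - 2.0 * ema_kernel(t, ALPHA * BETA * tau, 4))

def integrate(f, a, b, steps=100_000):
    """Midpoint rule."""
    dt = (b - a) / steps
    return sum(f(a + (i + 0.5) * dt) for i in range(steps)) * dt

def first_zero(f, lo=1e-6, hi=2.0, scan=2000):
    """First sign change of f on [lo, hi], refined by bisection."""
    step = (hi - lo) / scan
    a = lo
    while a < hi and f(a) * f(a + step) > 0.0:
        a += step
    b = a + step
    for _ in range(60):
        mid = 0.5 * (a + b)
        if f(a) * f(mid) <= 0.0:
            b = mid
        else:
            a = mid
    return 0.5 * (a + b)

t0 = first_zero(delta_kernel)
print("first zero:", t0)
print("total weight:", integrate(delta_kernel, 0.0, 30.0))
print("first moment:", integrate(lambda t: t * delta_kernel(t), 0.0, 30.0))
print("area up to first zero:", integrate(delta_kernel, 0.0, t0))
```

The total weight near zero and the first moment near -1 (for τ = 1) correspond to the two normalization conditions, and the area up to the first zero is the quantity that fixes γ.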

In finance, the main purpose of a Δ operator is computing returns of a time series of
(logarithmic) prices x with a given time interval τ. Returns are normally defined as changes
of x over τ; the alternative return definition r[τ] = Δ[τ; x] is used herein. This computation
requires the evaluation of 6 EMAs and is therefore efficient, time-wise and memory-wise.
An example using our "standard week" is plotted in FIG. 7, demonstrating the low noise
level of the differential. FIG. 7 illustrates a comparison between the differential computed
using the formula (33) with τ = 24 hours ("24h") (the solid line 710), and the point-wise
return x(t) - x(t - 24h) (the dotted line 720). The time lag of approximately 4 hours between
the curves is essentially due to the extent of both the positive part of the kernel (0 < t < 0.5)
and the tail of the negative part (t > 1.5).

The conventionally computed return r[τ](t) = x(t) - x(t - τ) is very inefficient to
evaluate for inhomogeneous time series. The computation of x(t - τ) requires many old t_i, x_i
values to be kept in memory, and the t_i interval bracketing the time t - τ has to be searched
for. Moreover, the number of ticks to be kept in memory is not bounded. This return
definition corresponds to a differential operator kernel made of two δ functions (or to the
limit τ_1, τ_2 → 0 of the kernel in FIG. 4). The quantity x(t) - x(t - τ) can be quite noisy, so a
further EMA might be taken to smooth it. In this case, the resulting effective differential
operator kernel has two discontinuities, at 0 and at τ, and decays exponentially (much slower
than the kernel of Δ[τ; x]). Thus it is cleaner and more efficient to compute returns with the Δ
operator of eq. (33).
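The memory argument can be made concrete with a streaming sketch: the return r[τ] = Δ[τ; x] is updated tick by tick from six EMA values, with no tick history kept. The class names are hypothetical, the EMA recursion is the linear-interpolation form that eq. (23) is assumed to have, and each EMA is started at the first observed value, so the first few τ of output are build-up.

```python
import math

GAMMA, BETA = 1.22208, 0.65
ALPHA = 1.0 / (GAMMA * (8.0 * BETA - 3.0))

class IteratedEMA:
    """n chained EMAs of range tau, updated tick by tick with O(n) state."""

    def __init__(self, tau, n):
        self.tau, self.n = tau, n
        self.t_prev = None
        self.inputs = None      # previous input of each stage
        self.state = None       # current output of each stage

    def update(self, t, z):
        if self.t_prev is None:
            self.inputs = [z] * self.n
            self.state = [z] * self.n
        else:
            alpha = (t - self.t_prev) / self.tau
            mu = math.exp(-alpha)
            nu = (1.0 - mu) / alpha if alpha > 0.0 else 1.0
            x = z
            for j in range(self.n):
                new = mu * self.state[j] + (nu - mu) * self.inputs[j] + (1.0 - nu) * x
                self.inputs[j] = x
                self.state[j] = new
                x = new             # output of stage j feeds stage j + 1
        self.t_prev = t
        return self.state           # state[j] is EMA[tau, j + 1]

class Differential:
    """Streaming Delta[tau] of eq. (33): 2 + 4 = 6 EMA states in total."""

    def __init__(self, tau):
        self.positive = IteratedEMA(ALPHA * tau, 2)
        self.negative = IteratedEMA(ALPHA * BETA * tau, 4)

    def update(self, t, x):
        e1, e2 = self.positive.update(t, x)
        e4 = self.negative.update(t, x)[3]
        return GAMMA * (e1 + e2 - 2.0 * e4)

ret = Differential(tau=1.0)
for t, price in [(0.0, 0.0), (0.3, 0.1), (0.35, 0.12), (1.2, 0.05), (2.0, 0.2)]:
    print(t, ret.update(t, price))
```

By contrast, the point-wise return x(t) - x(t - τ) would need a buffer holding every tick of the last τ, whose length depends on market activity.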

Another quantity commonly used in finance is x - EMA[τ; x], often called a
momentum or an oscillator. This is also a differential with the kernel δ(t) - exp(-t/τ)/τ, with
a δ function at t = 0. A simple drawing shows that the kernel of eq. (33) produces a much
less noisy differential. Other appropriate kernels can be designed, depending on the
application. In general, there is a trade-off between the averaging property of the kernel and a
short response to shocks of the original time series.



4.6 Derivative D[τ] and γ-derivative D[τ, γ]

The derivative operator

$$D[\tau] = \frac{\Delta[\tau]}{\tau}$$ (34)

behaves exactly as the differential operator Δ[τ], except for the normalization D[τ; t] = 1.
This derivative can be iterated in order to construct higher order derivatives:

$$D^2[\tau] = D[\tau; D[\tau]].$$ (35)

The range of the second-order derivative operator D^2 is 2τ. More generally, the n-th order
derivative operator D^n, constructed by iterating the derivative operator n times, has a range nτ.
As defined, the derivative operator has the dimension of an inverse time. It is easier to work
with dimensionless operators, and this is done by measuring τ in some units. One year
provides a convenient unit, corresponding to an annualized return when D[τ]x is computed.
The choice of unit is denoted by τ/1y, meaning that τ is measured in years; other units could
be taken as well.

For a random diffusion process, a more meaningful normalization for the derivative is
to take D[τ] = Δ[τ]/√(τ/1y). For a space of processes as in Section 3.4, such that eq. (9)
holds, the basic scaling behavior with τ is eliminated, namely E[(D[τ]z)^2] = σ^2. More
generally, we can define a γ-derivative as

$$D[\tau, \gamma] = \frac{\Delta[\tau]}{(\tau/1\mathrm{y})^{\gamma}}.$$ (36)

In a preferred embodiment, we use

γ = 0: differential,
γ = 0.5: stochastic diffusion process, (37)
γ = 1: the usual derivative.

An empirical probability density function for the derivative is displayed in FIG. 8,
which plots the annualized derivative D[τ, γ = 0.5; x] for USD/CHF from 1 Jan 1988 to 1
Nov 1998. The shorter time intervals τ correspond to the most leptokurtic curves. In order to
discard the daily and weekly seasonality, the computations are done on the business θ-time
scale according to (Dacorogna et al., 1993). The data was sampled every 2 hours in θ-time to
construct the curves. The Gaussian pdf, added for comparison, has a standard deviation of
σ = 0.07, similar to that of the other curves. The main part of the scaling with τ is removed
when the γ-derivative with γ = 0.5 is used.

4.7 Volatility

Volatility is a measure widely used for random processes, quantifying the size and
intensity of movements, namely the "width" of the probability distribution P(Δz) of the
process increment Δz, where Δ is a difference operator yet to be chosen. Often the volatility
of market prices is computed, but volatility is a general operator that can be applied to any
time series. There are many ways to turn this idea into a definition, and there is no unique,
universally accepted definition of volatility in finance. The most common computation is the
volatility of daily prices, Volatility[x], evaluated for a regular time series in business time,
with a point-wise price difference r_i = Δx_i = x(t_i) - x(t_i - τ') and τ' = 1 day. The time horizon
τ' of the return is one parameter of the volatility; a second parameter is the length τ of the
moving sample used to compute the "width." The most common definition for the width
estimator uses an L^2 norm:

$$\mathrm{Volatility}[\tau, \tau'; z] = \left(\frac{1}{N}\sum_{i=0}^{N-1}\left(\delta\,\mathrm{RTS}[\tau'; z]\right)_i^2\right)^{1/2}, \quad \text{with } \tau = N\tau',$$ (38)

where RTS[τ'; z] is an artificial regular time series, spaced by τ', constructed from the
irregular time series z (see Section 5.3). The operator δ computes the difference between
successive values (see Section 5.4).

The above definition suffers from several drawbacks. First, for inhomogeneous time
series, a synthetic regular time series must be created, which involves an interpolation
scheme. Second, the difference is computed with a point-wise difference. This implies some
noise in the case of stochastic data. Third, only some values at regular time points are used.
Information from other points of the series, between the regular sampling points, is thrown
away. Because of this information loss, the estimator is less accurate than it could be.
Fourth, it is based on a rectangular weighting kernel (all points have constant weights of
either 1/N or 0 as soon as they are excluded from the sample). A continuous kernel with
declining weights leads to a better, less disruptive, and less noisy behavior. Finally, by
squaring the returns, this definition puts a large weight on large changes of z and therefore
increases the impact of outliers and the tails of P(z). Also, as the fourth moment of the
probability distribution of the returns might not exist (see Müller U.A., Dacorogna M.M., and
Pictet O.V., Heavy tails in high-frequency financial data, in "A practical guide to heavy tails:
Statistical Techniques for Analysing Heavy Tailed Distributions," edited by Robert J. Adler,
Raisa E. Feldman and Murad S. Taqqu, published by Birkhäuser, Boston 1998) (hereinafter
Müller et al., 1998), the volatility of the volatility might not exist either. In other words, this
estimator is not very robust. There are thus several reasons to prefer a volatility defined as
an L^1 norm:

$$\mathrm{Volatility}[\tau, \tau'; z] = \frac{1}{N}\sum_{i=0}^{N-1}\left|\Delta[\mathrm{RTS}[\tau'; z]]_i\right|, \quad \text{with } \tau = N\tau'.$$ (39)

There are again many ways to introduce a better definition for inhomogeneous time
series. These definitions are variations of the following one, used in a preferred embodiment:

$$\mathrm{Volatility}[\tau, \tau', p; z] = \mathrm{MNorm}[\tau/2, p; \Delta[\tau'; z]]$$ (40)

where the moving norm MNorm is defined by eq. (32) and the differential operator Δ of
eq. (33) is used. Let us emphasize that a homogeneous time series is not needed, and that this
definition can be computed simply and efficiently for high-frequency data because it
ultimately involves only EMAs. Note the division by 2 in the MNorm of range τ/2. This is to
attain an equivalent of the definition (38) which is parametrized by the total size rather than
the range of the (rectangular) kernel.
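A minimal sketch of definition (40) on irregular ticks is given below. It is an illustration under stated assumptions rather than the implementation: the EMA recursion is the linear-interpolation form that eq. (23) is assumed to have, every EMA starts at the first observed value, the moving norm uses MA[τ/2, n] with n = 8 as an example choice, and all function names are hypothetical.

```python
import math

GAMMA, BETA = 1.22208, 0.65
ALPHA = 1.0 / (GAMMA * (8.0 * BETA - 3.0))

def ema_series(times, values, tau):
    """One EMA pass, linear interpolation between ticks (assumed form of eq. 23)."""
    out = [values[0]]
    for i in range(1, len(times)):
        alpha = (times[i] - times[i - 1]) / tau
        mu = math.exp(-alpha)
        nu = (1.0 - mu) / alpha if alpha > 0.0 else 1.0
        out.append(mu * out[-1] + (nu - mu) * values[i - 1] + (1.0 - nu) * values[i])
    return out

def iterated_emas(times, values, tau, n):
    """Return the series EMA[tau, 1], ..., EMA[tau, n]."""
    result, current = [], list(values)
    for _ in range(n):
        current = ema_series(times, current, tau)
        result.append(current)
    return result

def ma_series(times, values, tau, n=8):
    """MA[tau, n] of eq. (28)."""
    stages = iterated_emas(times, values, 2.0 * tau / (n + 1), n)
    return [sum(col) / n for col in zip(*stages)]

def differential(times, x, tau):
    """Delta[tau; x] of eq. (33)."""
    e1, e2 = iterated_emas(times, x, ALPHA * tau, 2)
    e4 = iterated_emas(times, x, ALPHA * BETA * tau, 4)[-1]
    return [GAMMA * (a + b - 2.0 * c) for a, b, c in zip(e1, e2, e4)]

def volatility(times, x, tau, tau_prime, p=2.0, n=8):
    """Volatility[tau, tau', p; x] = MNorm[tau/2, p; Delta[tau'; x]], eq. (40)."""
    returns = differential(times, x, tau_prime)
    moments = ma_series(times, [abs(r) ** p for r in returns], tau / 2.0, n)
    return [m ** (1.0 / p) for m in moments]

ticks = [0.0, 0.2, 0.25, 0.9, 1.4, 1.45, 2.3, 3.0, 3.1, 4.0]
prices = [0.00, 0.01, 0.03, 0.02, 0.05, 0.04, 0.01, 0.02, 0.06, 0.03]
print(volatility(ticks, prices, tau=2.0, tau_prime=0.5))
```

Every stage is an EMA pass, so a tick-by-tick version only has to carry a fixed number of EMA values forward.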

The variations of definition (40) used in alternate embodiments include, first,
replacing the norm MNorm by a moving standard deviation MSD, as defined by eq. (32).
This modification subtracts the empirical sample mean from all observations of Δ[τ'; z]. This
is not standard for volatility computations of prices in finance, but might be a better choice
for other time series or applications. Empirically, for most data in finance (e.g., FX data), the
numerical difference between taking MNorm and MSD is very small. The second variation is
replacing the differential Δ by a γ-derivative D[τ, γ]. The advantage of using the γ-derivative
is to remove the leading τ dependence (for example, by directly computing the
annualized volatility, independently of τ). An example is given by FIG. 9, which shows the
annualized volatility computed as MNorm[τ/2; D[τ/32, γ = 0.5; x]] with τ = 1h. The norm is
computed with p = 2 and n = 8. The plotted volatility has 5 main maxima corresponding to
the 5 working days of the example week. The Tuesday maximum 910 is higher than the
others, due to the stock crash mentioned above.

Let us emphasize that the volatility definition (38) depends on the two time ranges τ
and τ' and, to be unambiguous, both time intervals must be given. Yet, for example when
talking about a daily volatility, the common terminology is rather ambiguous because only
one time interval is specified. Usually, the emphasis is put on τ'. A daily volatility, for
example, measures the average size of daily price changes, i.e., τ' = 1 day. The averaging
time range τ is chosen as a multiple of τ', of the order τ > τ' up to τ = 1000 τ' or more.
Larger multiples lead to lower stochastic errors as they average over large samples, but they
are less local and dampen the possible time variations of the volatility. From empirical
studies, one can conclude that good compromises are in the range from τ = 16 τ' to τ = 32 τ'.

On other occasions, for example in risk management, one is interested in the
conditional daily volatility. Given the prices up to today, we want to produce an estimate or
forecast for the size of the price move from today to tomorrow (i.e., the volatility within a
small sample of only one day). The actual value of this volatility can be measured one day
later; it has τ = 1 day by definition. In order to measure this value with an acceptable
precision, we may choose a distinctly smaller τ', perhaps τ' = 1 hour. Clearly, when only one
time parameter is given, there is no simple convention to remove the ambiguity.

4.8 Standardized time series ẑ, moving skewness and kurtosis

From a time series z, we can derive a moving standardized time series

$$\hat{z}[\tau] = \frac{z - \mathrm{MA}[\tau; z]}{\mathrm{MSD}[\tau; z]}.$$ (41)

In finance, z typically stands for a time series of returns rather than prices.

Once a standardized time series ẑ[τ] has been defined, the definitions for the moving
skewness and the moving kurtosis are straightforward:

$$\begin{aligned}
\mathrm{MSkewness}[\tau_1, \tau_2; z] &= \mathrm{MA}[\tau_1; \hat{z}[\tau_2]^3], \\
\mathrm{MKurtosis}[\tau_1, \tau_2; z] &= \mathrm{MA}[\tau_1; \hat{z}[\tau_2]^4].
\end{aligned}$$ (42)

The three quantities for our sample week are displayed in FIG. 10, which shows plots of the
standardized return 1020, moving skewness 1040, and moving kurtosis 1060. The returns are
computed as r = D[τ = 15 minutes; x] and standardized with τ_1 = τ_2 = 24h.

4.9 Moving correlation

Several definitions of a moving correlation can be constructed for inhomogeneous
time series. Generalizing from the statistics textbook definition, we can write two simple
definitions:

$$\mathrm{MCorrelation}_1[\tau; y, z] = \frac{\mathrm{MA}[(y - \mathrm{MA}[y])(z - \mathrm{MA}[z])]}{\mathrm{MSD}[y]\,\mathrm{MSD}[z]},$$ (43)

$$\mathrm{MCorrelation}_2[\tau; y, z] = \mathrm{MA}\left[\frac{(y - \mathrm{MA}[y])(z - \mathrm{MA}[z])}{\mathrm{MSD}[y]\,\mathrm{MSD}[z]}\right] = \mathrm{MA}[\hat{y}\,\hat{z}],$$ (44)

where all the MA and MSD operators on the right-hand sides are taken with the same decay
constant τ. These definitions are not equivalent because the MSD operators in the
denominator are time series that do not commute with the MA operators. Yet both
definitions have their respective advantages. The first definition obeys the inequality -1 ≤
MCorrelation_1 ≤ 1. This can be proven by noting that MA[z^2](t) for a given t provides a norm
on the space of (finite) time series up to t. This happens because the MA operator has a
strictly positive kernel that acts as a metric on the space of time series. In this space, the
triangle inequality holds: √MA[(y + z)^2] ≤ √MA[y^2] + √MA[z^2], and, by a standard
argument, the inequality on the correlation follows. With the second definition (44), the
correlation matrix is bilinear for the standardized time series. Therefore, the rotation that
diagonalizes the correlation matrix acts linearly in the space of standardized time series. This
property is necessary for multivariate analysis, when a principal component decomposition is
used.
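The two definitions (43) and (44) can be compared with the following sketch for two series observed at the same irregular times. The assumptions are labeled in the code: the EMA recursion is the linear-interpolation form that eq. (23) is assumed to have, p = 2 is used in MSD, the value returned while a moving standard deviation is still zero (`nan` in the first definition, a zero contribution in the second) is a choice made for this example, and the function names are hypothetical.

```python
import math

def ema_series(times, values, tau):
    """One EMA pass, linear interpolation between ticks (assumed form of eq. 23)."""
    out = [values[0]]
    for i in range(1, len(times)):
        alpha = (times[i] - times[i - 1]) / tau
        mu = math.exp(-alpha)
        nu = (1.0 - mu) / alpha if alpha > 0.0 else 1.0
        out.append(mu * out[-1] + (nu - mu) * values[i - 1] + (1.0 - nu) * values[i])
    return out

def ma_series(times, values, tau, n=4):
    """MA[tau, n] of eq. (28)."""
    tau_p = 2.0 * tau / (n + 1)
    total, current = [0.0] * len(values), list(values)
    for _ in range(n):
        current = ema_series(times, current, tau_p)
        total = [a + c for a, c in zip(total, current)]
    return [a / n for a in total]

def centered(times, y, tau, n):
    """y - MA[tau; y]."""
    return [v - m for v, m in zip(y, ma_series(times, y, tau, n))]

def mcorrelation_1(times, y, z, tau, n=4):
    """Eq. (43): MA of the product divided by the two MSDs (p = 2)."""
    dy, dz = centered(times, y, tau, n), centered(times, z, tau, n)
    cov = ma_series(times, [a * b for a, b in zip(dy, dz)], tau, n)
    vy = ma_series(times, [a * a for a in dy], tau, n)
    vz = ma_series(times, [b * b for b in dz], tau, n)
    # Assumption: undefined (nan) while either moving variance is still zero.
    return [c / math.sqrt(a * b) if a > 0.0 and b > 0.0 else float("nan")
            for c, a, b in zip(cov, vy, vz)]

def mcorrelation_2(times, y, z, tau, n=4):
    """Eq. (44): MA of the product of the standardized series."""
    dy, dz = centered(times, y, tau, n), centered(times, z, tau, n)
    sy = [math.sqrt(v) for v in ma_series(times, [a * a for a in dy], tau, n)]
    sz = [math.sqrt(v) for v in ma_series(times, [b * b for b in dz], tau, n)]
    # Assumption: a point contributes zero while either MSD is still zero.
    prod = [a * b / (s * u) if s > 0.0 and u > 0.0 else 0.0
            for a, b, s, u in zip(dy, dz, sy, sz)]
    return ma_series(times, prod, tau, n)

times = [0.0, 0.3, 0.9, 1.0, 1.8, 2.5, 2.6, 3.4, 4.1, 4.5]
y = [1.0, 2.0, 0.5, 1.5, 3.0, 2.5, 0.0, 1.0, 2.0, 1.2]
z = [0.2, 1.1, 0.9, 0.3, 2.0, 2.2, 0.1, 0.5, 1.9, 1.0]
print(mcorrelation_1(times, y, z, tau=2.0))
print(mcorrelation_2(times, y, z, tau=2.0))
```

In this sketch the first definition stays within [-1, 1] because all three moving averages apply the same non-negative weights, which is the discrete counterpart of the norm argument above; the second definition carries no such bound.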

In risk management, the correlation of two time series of returns, y and z, is usually
computed without subtracting the sample means of y and z. This implies a variation of
eqs. (43) and (44):

$$\mathrm{MCorrelation}'_1[\tau; y, z] = \frac{\mathrm{MA}[y\,z]}{\mathrm{MNorm}[y]\,\mathrm{MNorm}[z]},$$ (45)

$$\mathrm{MCorrelation}'_2[\tau; y, z] = \mathrm{MA}\left[\frac{y\,z}{\mathrm{MNorm}[y]\,\mathrm{MNorm}[z]}\right],$$ (46)

where again the same τ is chosen for all MA operators.

In general, any reasonable definition of a moving correlation must obey

$$\lim_{\tau \to \infty}\mathrm{MCorrelation}[\tau; y, z] = \rho[y, z],$$ (47)

where ρ[y, z] is the theoretical correlation of the two stationary processes y and z.
Generalizing the definition (44), the requirements for the correlation kernel are to construct a
causal, time-translation-invariant, and linear operator for ŷ and ẑ. This leads to the most
general representation

$$\mathrm{MCorrelation}[\hat{y}, \hat{z}](t) = \int_0^\infty\!\!\int_0^\infty dt'\, dt''\, c(t', t'')\, \hat{y}(t - t')\, \hat{z}(t - t'').$$ (48)

We also require symmetry between the arguments: MCorrelation[ẑ, ŷ] =
MCorrelation[ŷ, ẑ]. Moreover, the correlation must be a generalized average, namely
MCorrelation[Const, Const'] = Const·Const', or for the kernel, ∫∫ dt' dt'' c(t', t'') = 1, with
both integrals taken from 0 to ∞.
There is a large selection of possible kernels that obey the above requirements. For example,
eq. (44) is equivalent to the kernel c(t', t'') = δ(t' - t'') ma((t' + t'')/2), but other choices might
be better than this one.

4.10 Windowed Fourier transform

In order to study a time series and its volatility at different time scales, we want to
have a method similar to wavelet transform methods, yet adapted to certain frequencies. As
with wavelet transforms, a double representation in time and frequency is needed, but an
invertible transformation is not needed here because our aim is to analyze rather than further
process the signal. This gives us more flexibility in the choice of the transformations.

A single causal kernel with the desired properties is, or is similar to, ma[τ](t) sin(kt/τ).
Essentially, the sine part is (locally) analyzing the signal at a frequency k/τ and the ma part is
taking a causal window of range τ. In order to obtain a couple of oscillations in the window
2τ, choose k between k ~ π and k ~ 5π. Larger k values increase the frequency resolution at
the cost of the time resolution. The basic idea is to compute an EMA with a complex τ; this
is equivalent to including a sine and cosine part in the kernel. The advantageous
computational iterative property of the moving average is preserved.

The first step is to use complex iterated EMAs. The kernel of the complex ema is
defined as

$$\mathrm{ema}[\zeta](t) = \frac{e^{-\zeta t}}{\tau}, \quad \text{where } \zeta = \frac{1}{\tau}(1 + ik),$$ (49)

and where ζ is complex (ζ ∈ C) but τ is again a real number. The choice of the normalization
factor 1/τ is somewhat arbitrary (a factor |ζ| will produce the same normalization for the real
case k = 0) but leads to a convenient definition of the windowed Fourier kernel below. By
using the convolution formula, one can prove iteratively that the kernel of the complex
EMA[ζ, n] is given by

$$\mathrm{ema}[\zeta, n](t) = \frac{1}{(n-1)!}\left(\frac{t}{\tau}\right)^{n-1}\frac{e^{-\zeta t}}{\tau},$$ (50)

which is analogous to eq. (26). The normalization is such that, for a constant function c(t) =
c,

$$\mathrm{EMA}[\zeta, n; c] = \frac{c}{(1 + ik)^n}.$$ (51)

Using techniques similar to those applied to eq. (23), we obtain an iterative computational
formula for the complex EMA:

$$\mathrm{EMA}[\zeta; z](t_n) = \mu\,\mathrm{EMA}[\zeta; z](t_{n-1}) + z_{n-1}\frac{\nu - \mu}{1 + ik} + z_n\frac{1 - \nu}{1 + ik}, \quad \text{with } \alpha = \zeta\,(t_n - t_{n-1}), \quad \mu = e^{-\alpha},$$ (52)

where ν depends on the chosen interpolation scheme as given by eq. (24).
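The recursion of eq. (52) can be sketched with Python's built-in complex numbers. In this sketch the linear-interpolation value ν = (1 - μ)/α is assumed to carry over from the real case with a complex α, and the recursion is started at the steady-state value z_0/(1 + ik); both are assumptions of this example, as are the function names.

```python
import cmath

def complex_ema(times, values, tau, k):
    """One pass of the complex EMA recursion of eq. (52), linear interpolation."""
    norm = complex(1.0, k)                 # 1 + ik
    zeta = norm / tau
    out = [values[0] / norm]               # assumption: steady-state start
    for i in range(1, len(times)):
        alpha = zeta * (times[i] - times[i - 1])
        mu = cmath.exp(-alpha)
        nu = (1.0 - mu) / alpha if alpha != 0 else 1.0
        out.append(mu * out[-1]
                   + values[i - 1] * (nu - mu) / norm
                   + values[i] * (1.0 - nu) / norm)
    return out

def complex_iterated_ema(times, values, tau, k, n):
    """EMA[zeta, n]: the recursion applied n times in sequence."""
    current = list(values)
    for _ in range(n):
        current = complex_ema(times, current, tau, k)
    return current

times = [0.0, 0.2, 0.5, 0.55, 1.4, 2.0]
constant = [2.0] * len(times)
print(complex_iterated_ema(times, constant, tau=0.7, k=6.0, n=3))
print(2.0 / complex(1.0, 6.0) ** 3)        # value expected from eq. (51)
```

For a constant input the recursion reproduces c/(1 + ik)^n at every tick, and for k = 0 it reduces to the real EMA.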

We define the (complex) kernel wf(t) of the windowed Fourier transform WF as

$$\mathrm{wf}[\tau, k, n](t) = \mathrm{ma}[\tau, n](t)\, e^{-ikt/\tau'} = \frac{1}{n}\sum_{j=1}^{n}\frac{1}{(j-1)!}\left(\frac{t}{\tau'}\right)^{j-1}\frac{e^{-\zeta t}}{\tau'} = \frac{1}{n}\sum_{j=1}^{n}\mathrm{ema}[\zeta, j](t),$$ (53)

where ζ = (1 + ik)/τ' and τ' = 2τ/(n + 1), as in eq. (28).

FIG. 11 plots the kernel wf(t) for the windowed Fourier operator WF, for n = 8 and k
= 6. Three aspects of the complex kernel are shown: (1) the envelope 1120 (= absolute
value), (2) the real part 1140 (starting on top), and (3) the imaginary part 1160 (starting at
zero). Another appropriate name for this operator might be CMA for Complex Moving
Average. The normalization is such that, for a constant function c(t) = c,

$$\mathrm{WF}[\zeta, n; c] = N_{WF}\, c, \quad \text{with } N_{WF} = \frac{1}{n}\sum_{j=1}^{n}\frac{1}{(1 + ik)^j}.$$

In order to provide a more convenient real quantity, with the mean of the signal
subtracted, we can define a (non-linear) normed windowed Fourier transform as

$$\mathrm{NormedWF}[\zeta, n; z] = \left|\mathrm{WF}[\zeta, n; z] - N_{WF}\,\mathrm{MA}[\tau, n; z]\right|.$$ (54)

The normalization is chosen so that

$$\mathrm{NormedWF}[\zeta, n; c] = 0.$$

Note that in eq. (54) we are only interested in the amplitude of the measured
frequency; by taking the absolute value we have lost information on the phase of the
oscillations.
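A hedged sketch of the normed windowed Fourier transform of eq. (54) follows. It assumes that WF is the mean of the first n complex iterated EMAs with basic range τ' = 2τ/(n + 1), that the real MA[τ, n] is obtained from the same routine with k = 0, and that the complex EMA recursion uses the linear-interpolation value ν = (1 - μ)/α; the function names are hypothetical.

```python
import cmath

def complex_ema(times, values, tau, k):
    """Complex EMA recursion (assumed linear-interpolation form of eq. 52)."""
    norm = complex(1.0, k)
    zeta = norm / tau
    out = [values[0] / norm]               # assumption: steady-state start
    for i in range(1, len(times)):
        alpha = zeta * (times[i] - times[i - 1])
        mu = cmath.exp(-alpha)
        nu = (1.0 - mu) / alpha if alpha != 0 else 1.0
        out.append(mu * out[-1]
                   + values[i - 1] * (nu - mu) / norm
                   + values[i] * (1.0 - nu) / norm)
    return out

def windowed_fourier(times, z, tau, k, n):
    """WF[tau, k, n; z]: mean of the first n complex iterated EMAs of range tau'."""
    tau_p = 2.0 * tau / (n + 1)
    total = [0j] * len(z)
    current = list(z)
    for _ in range(n):
        current = complex_ema(times, current, tau_p, k)
        total = [a + c for a, c in zip(total, current)]
    return [a / n for a in total]

def normed_wf(times, z, tau, k=6.0, n=8):
    """NormedWF = |WF[z] - N_WF * MA[z]|, where N_WF is WF applied to the constant 1."""
    norm = complex(1.0, k)
    n_wf = sum(norm ** -j for j in range(1, n + 1)) / n
    wf = windowed_fourier(times, z, tau, k, n)
    ma = windowed_fourier(times, z, tau, 0.0, n)   # k = 0 gives the real MA[tau, n]
    return [abs(w - n_wf * m) for w, m in zip(wf, ma)]

times = [0.0]
while times[-1] < 10.0:
    times.append(times[-1] + (0.01 if len(times) % 2 else 0.02))
tau_p = 2.0 / 9.0                                   # tau' for tau = 1, n = 8
wave = [cmath.sin(6.0 * t / tau_p).real for t in times]
print(normed_wf(times, wave, tau=1.0)[-1])          # signal at the analyzed frequency
print(max(normed_wf(times, [3.0] * len(times), tau=1.0)))   # constant signal
```

A constant input is mapped to zero up to rounding, while an oscillation at the analyzed frequency k/τ' gives a clearly non-zero amplitude.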

FIG. 12 shows an example of the normed windowed Fourier transform for the
example week, wherein the normed windowed Fourier transform is plotted, with τ = 1 hour, k
= 6, and n = 8. The stock market crash is again nicely spotted as the peak 1220 on
Tuesday 28.

Using the described methods, other quantities of interest can be easily calculated. For
example, we can compute the relative share of a certain frequency in the total volatility. This
would mean a volatility correction of the normed windowed Fourier transform. A way to
achieve this is to divide NormedWF by a suitable volatility, or to replace z by the
standardized time series ẑ in eq. (54).



Implementation

In a preferred embodiment, the techniques described above are implemented in a
method used to obtain predictive information for inhomogeneous financial time series. Major
steps of the method (see FIG. 13) comprise the following: At step 1310 financial market
transaction data is electronically received by a computer over an electronic network. At step
1320 the received financial market transaction data is electronically stored in a computer-readable
medium accessible to the computer (e.g., on a hard drive, in RAM, or on an optical
storage disk). At step 1330 a time series z is constructed that models the received financial
market transaction data. At step 1340 an exponential moving average operator is constructed,
and at step 1350 an iterated exponential moving average operator is constructed that is based
on the exponential moving average operator constructed in step 1340. At step 1360, a time-
translation-invariant, causal operator Ω[z] is constructed that is based on the iterated
exponential moving average operator constructed in step 1350. Ω[z] is a convolution
operator with kernel ω and time range τ. (It is important to note, with respect to the steps
described herein, that no particular order, other than that order required to make the method
practical, should be inferred from the fact that the steps are described and labeled in a certain
order; the order has been chosen merely for ease of explication.) At step 1370 values of one
or more predictive factors relating to said time series z are calculated by the computer. The
predictive factors are defined in terms of the convolution operator Ω[z]. At step 1380 the
values of one or more predictive factors calculated by the computer are stored in a
computer-readable medium (not necessarily the same medium as in step 1320).

Various predictive factors have been described above, and specifically comprise
return, momentum, and volatility. Other predictive factors will be apparent to those skilled in
the art.

In a second preferred embodiment, the major steps of the method differ from those
described in connection with FIG. 13. This second preferred embodiment is illustrated in
FIG. 14. Major steps of the method comprise the following: At step 1410 financial market
transaction data is electronically received by a computer over an electronic network. At step
1420 the received financial market transaction data is electronically stored in a computer-
readable medium accessible to the computer. At step 1430 a time series z is constructed that
models the received financial market transaction data. At step 1440 an exponential moving
average operator is constructed, and at step 1450, an iterated exponential moving average operator is
constructed that is based on the exponential moving average operator constructed in step 1440. At
step 1460, a time-translation-invariant, causal operator Ω[z] is constructed that is based on the
iterated exponential moving average operator constructed in step 1450. Ω[z] is a convolution
operator with kernel ω and time range τ. At step 1470 a standardized time series ẑ is
constructed (see Section 4.8). At step 1480 values of one or more predictive factors relating
to said time series z are calculated by the computer. The predictive factors in this case are
defined in terms of the standardized time series ẑ. At step 1490 the values of one or more
predictive factors calculated by the computer are stored in a computer-readable medium (not
necessarily the same medium as in step 1420). In addition to the predictive factors mentioned
above, additional predictive factors relevant to this method are moving skewness and moving
kurtosis.

In a third preferred embodiment, the major steps are similar to those described above,
but differ enough to call for separate explanation. In this embodiment, illustrated in FIG. 15,
the steps are as follows: At step 1510 financial market transaction data is electronically
received by a computer over an electronic network. At step 1520 the received financial
market transaction data is electronically stored in a computer-readable medium accessible to
the computer. At step 1530 a time series z is constructed that models the received financial
market transaction data. At step 1540 an exponential moving average operator is constructed,
and at step 1550 an iterated exponential moving average operator is constructed that is based
on the exponential moving average operator constructed in step 1540. At step 1555, a time-
translation-invariant, causal operator Ω[z] is constructed that is based on the iterated
exponential moving average operator constructed in step 1550. Ω[z] is a convolution
operator with kernel ω and time range τ. At step 1550 an exponential moving average
operator EMA[τ; z] is constructed, and at step 1560 a moving average operator MA is
constructed. MA depends on EMA (see Section 4.3). At step 1565 a moving standard
deviation operator MSD is constructed. MSD depends on MA (see Section 4.4). At step
1570 values of one or more predictive factors relating to said time series z are calculated by
the computer. The predictive factors are defined in terms of one or more of EMA, MA, and
MSD. At step 1580 the values of one or more predictive factors calculated by the computer
are stored in a computer-readable medium (not necessarily the same medium as in step 1520).

In a fourth preferred embodiment of the invention, the major steps (illustrated in FIG.
16) are again similar to those described above, but merit separate description. At step 1610
financial market transaction data is electronically received by a computer over an electronic
network. At step 1620 the received financial market transaction data is electronically stored
in a computer-readable medium accessible to the computer. At step 1630 a time series z is
constructed that models the received financial market transaction data. At step 1640 a
complex iterated exponential moving average operator EMA[τ; z], with kernel ema, is
constructed (see Section 4.10). At step 1650 a time-translation-invariant, causal operator
Ω[z] is constructed. Ω[z] is a convolution operator with kernel ω and time range τ, and is
based on the operator constructed in step 1640. At step 1660 a windowed Fourier transform
WF is constructed (WF is defined in terms of EMA; see Section 4.10). At step 1670 values
of one or more predictive factors relating to said time series z are calculated by the computer.
The predictive factors are defined in terms of the windowed Fourier transform WF. At step
1680 the values of one or more predictive factors calculated by the computer are stored in a
computer-readable medium (not necessarily the same medium as in step 1620).

Although the subject invention has been described with reference to preferred
embodiments, numerous modifications and variations can be made that will still be within the
scope of the invention. No limitation with respect to the specific embodiments disclosed
herein is intended or should be inferred.